
US006377983B1 

(12) United States Patent (10) Patent No.: US 6,377,983 B1

Cohen et al. (45) Date of Patent: Apr. 23, 2002 



(54) METHOD AND SYSTEM FOR CONVERTING 
EXPERTISE BASED ON DOCUMENT USAGE 

(75) Inventors: Andrew L. Cohen, Brookline, MA 
(US); Paul P. Maglio, Santa Cruz; 
Robert C. Barrett, Sunnyvale, both of 
CA (US); Marie A. Sheldon, Arlington, 
MA (US) 

(73) Assignee: International Business Machines
Corporation, Armonk, NY (US) 

( * ) Notice: Subject to any disclaimer, the term of this 
patent is extended or adjusted under 35 
U.S.C. 154(b) by 0 days. 

(21) Appl. No.: 09/192,047 

(22) Filed: Nov. 13, 1998 

Related U.S. Application Data

(60) Provisional application No. 60/098,568, filed on Aug. 31, 
1998. 



(51) Int. Cl.7    G06F 15/16

(52) U.S. Cl.    709/217; 709/219; 709/209; 709/223; 707/4; 701/1

(58) Field of Search    709/217, 219, 709/209, 223, 249, 200, 203, 201; 707/4; 701/1

(56) References Cited

U.S. PATENT DOCUMENTS



5,530,852 A     6/1996    Meske, Jr. et al.
5,708,806 A     1/1998    DeRose et al.
5,781,785 A     7/1998    Rowe et al.
5,784,568 A     7/1998    Needham
5,787,274 A     7/1998    Agrawal et al.
5,793,365       8/1998    Tang et al.
5,796,393       8/1998    MacNaughton et al.
5,813,007       9/1998    Nielsen
5,815,830       9/1998    Anthony
5,819,258      10/1998    Vaithyanathan et al.
5,857,179 A     1/1999    Vaithyanathan et al.
5,911,140 A     6/1999    Tukey et al.



5,918,237       6/1999    Montalbano
5,923,845       7/1999    Kamiya et al. ...
5,951,641       9/1999    Menard et al. ...
5,956,509       9/1999    Kevner
5,987,503      11/1999    Murakami
6,014,136       1/2000    Ogasawara et al.



379/93.15 
... 709/217 



(List continued on next page.) 

OTHER PUBLICATIONS

Allerton 96, Bruce L. Lambert, Content Analysis Via Document
Clustering, http://cdfu.lis.uiuc.edu/allerton/96/lambert.html,
1 page.

(List continued on next page.) 

Primary Examiner — Ayaz Sheikh 
Assistant Examiner — Firmin Backer 

(74) Attorney, Agent, or Firm — Brown Raysman Millstein
Felder & Steiner LLP



(57) 



ABSTRACT 



The invention disclosed herein relates to cooperative com- 
puting environments and information retrieval and manage- 
ment methods and systems. More particularly, the present 
invention relates to methods and systems for capturing and 
generating useful information about a user's access and use 
of data on a computer system, such as in the form of 
documents stored on remote servers, and making such useful 
information available to others. Documents on the computer 
system are accessible through a plurality of different 
methods, such as by specifying an identifier or locator for 
the document, activating a hyperlink in another document 
which points to the document, or navigating to the document 
through navigational commands in an application program 
such as a browser. The method involves capturing informa- 
tion regarding each of the accessed documents in the set, the 
information including the method used to access the 
document, dividing the set of documents into subsets of 
documents based at least in part on the methods used to 
access the documents, labeling each subset of documents 
with a topic, and making the labels and documents accessed 
available to other users who wish to browse the same 
documents. 
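
The grouping and labeling steps summarized in the abstract can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented parsing process: the access-method names (`typed_url`, `bookmark`, `hyperlink`, `navigation`), the `Visit` record, the stop-word list, and the rule that a typed address or a bookmark starts a new subset are all hypothetical simplifications. The label is simply the most frequent remaining word in the subset.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import List

# Hypothetical method names: accesses assumed to signal a new line of inquiry.
# Every other method (hyperlink, BACK/FORWARD navigation, ...) is treated as
# continuing the current subset. This rule is a simplification.
STARTS_NEW_SUBSET = {"typed_url", "bookmark"}

# Tiny illustrative stop list so that function words never become labels.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


@dataclass
class Visit:
    url: str      # document identifier
    method: str   # how the document was accessed
    text: str     # document content used for labeling


def split_trail(trail: List[Visit]) -> List[List[Visit]]:
    """Divide a trail into subsets based on the access method of each visit."""
    subsets: List[List[Visit]] = []
    for visit in trail:
        if not subsets or visit.method in STARTS_NEW_SUBSET:
            subsets.append([visit])      # begin a new subset
        else:
            subsets[-1].append(visit)    # continue the current subset
    return subsets


def label_subset(subset: List[Visit]) -> str:
    """Label a subset with its most frequent non-stop word."""
    words = [
        word
        for visit in subset
        for word in re.findall(r"[a-z]+", visit.text.lower())
        if word not in STOP_WORDS
    ]
    if not words:
        return "(unlabeled)"
    return Counter(words).most_common(1)[0][0]
```

Calling `split_trail` on a logged trail and then `label_subset` on each resulting subset yields one topic label per subset, which is the pairing that would be made available to other users.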

49 Claims, 8 Drawing Sheets 
METHOD AND SYSTEM FOR CONVERTING EXPERTISE BASED ON DOCUMENT USAGE

RELATED APPLICATIONS

This application is related to and claims the benefit of provisional application serial No. 60/098,568, titled THE EXPERTISE BROWSER: HOW TO LEVERAGE DISTRIBUTED ORGANIZATIONAL KNOWLEDGE, filed Aug. 31, 1998, which is hereby incorporated by reference into this application.

This application is related to commonly owned application Ser. No. 09/143,075, titled METHOD AND SYSTEM FOR INFORMING USERS OF SUBJECTS OF DISCUSSION IN ON-LINE CHATS, filed Aug. 28, 1998, which is hereby incorporated by reference into this application.

This application is related to commonly owned application Ser. No. 09/191,587, titled METHOD AND SYSTEM FOR SUMMARIZING TOPICS OF DOCUMENTS BROWSED BY A USER, filed Nov. 13, 1998, which is hereby incorporated by reference into this application.

COPYRIGHT NOTICE

A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever.

BACKGROUND OF THE INVENTION

The invention disclosed herein relates to cooperative computing environments and information retrieval and management methods and systems. More particularly, the present invention relates to methods and systems for capturing and generating useful information about a user's access and use of data on a computer system, such as in the form of documents stored on remote servers, and making such useful information available to others.

Computer systems such as organizational networks, database systems and the Internet, provide a wealth of information to users. However, users must know how to find the information they want. Indeed, searching for specific information on a desired subject of interest is often a difficult and tedious process that is usually aided by the user's existing knowledge of or expertise in the subject. This is particularly true in the relatively unstructured environment of the Internet.

Using the world wide web, for example, a user might begin a search for desired information by entering a keyword query through a search engine, and then follow hyperlinks contained in the web documents to move from one document to another until the desired information is found. Since keyword searches are typically unreliable and do not immediately produce directly relevant results, users are often required to browse through a number of documents until some directly relevant information is found. Expertise in a subject usually helps users formulate better keyword searches and recognize the relevance of the various results found.

Moreover, particular documents usually provide only part of the specific information desired, and thus users must often access a number of such documents until a complete set of useful information is compiled from the various documents. During this process, users also make frequent use of navigational commands offered by the user's web browser program, such as the BACK and FORWARD commands and the history or GO list to view documents previously accessed, and the HOME command to navigate to a home page in relation to a particular page found.

If the desired information is not found after a while, the user frequently restarts the search process by jumping to a new, unrelated resource such as the original or another search engine, an index file, or a known document which may have helped the user in the past in related searches. This jump is usually performed by manual entry of the address of the new resource, such as the uniform resource locator (URL) in the case of the web. Alternatively, if the user previously visited the resource and stored its URL as a bookmark on the browser, the user can jump to the new resource by selecting the bookmark. Of course, the user may be distracted during the search process by a hyperlink to another document which is completely unrelated to the [illegible] advertisement to [illegible] information before returning to the thread of the search.

Thus, by the time a user finds a number of documents which contain the desired information, the search process has likely led the user through a path of numerous documents accessed in many different ways depending upon the user's judgment as to which way would bring the user closer [illegible].

Having now expended time and effort to compile this useful set of documents, the user is apt to want to capture this set both for the user's own later use as well as for use by others. Several software programs allow users to store a series of documents as the user browses the documents. However, this path will likely include a number of documents which are unrelated to the search process or are otherwise unhelpful, as explained above. Those programs that allow users to edit their paths still require substantial manual effort and judgment on the part of the user. Moreover, other users have no way of finding paths or sequences of documents which relate to specific topics or which were created by specific users or by users with specific areas of expertise. Later users thus can not take advantage of the time and expertise of the first user in performing the search and browsing through numerous documents to find those that are truly relevant and helpful.

There is therefore a need for powerful tools and methods that capture a user's browsing history and automatically generate a set of useful documents and resources from this history for the user's later use as well as use by others.

SUMMARY OF THE INVENTION

It is an object of the present invention to solve the problems described above with existing document browsing systems.

It is another object of the present invention to allow a broad range of users to obtain the benefit of the expertise of experts as expressed through the experts' access and use of documents.

It is another object of the present invention to automatically parse document browser trails or paths into sequences of documents which are related by a common topic.

It is another object of the present invention to facilitate the use of the distributed expertise within an organization by making available traces of experts' browsing and searching behavior.

It is another object of the present invention to help users find documents that someone with expertise in a particular field has already read.
It is another object of the present invention to account for a user's method of accessing documents in determining how to group together sets of related documents.

Some of the above and other objects are achieved by a method and system for conveying expertise of a first user of a computer system to one or more second users of the computer system. The computer system stores a plurality of documents usable by the first and second users. The system may be a single computer, such as a personal computer accessible by a number of users, or may be a network of computers arranged in a client/server or related architecture. In a network environment, the documents would be stored on a server and usable by the first and second users through requests for the documents from individual client computers.

The method involves capturing information regarding a sequence of documents used by the first user on the computer system and associating a plurality of content areas with a plurality of sets of documents in the sequence based on the captured information. The second user is then allowed to select one of the content areas, and the set of documents associated with the selected content area is provided to the second user, possibly in the same sequence in which the documents were browsed by the first user.

The first user's information may be captured by monitoring the first user's activities and storing the desired information in a log file. The desired information includes an identifier of each document, such as a URL in the case of the Web, or database names and fields in the case of documents stored in server databases such as in the LOTUS NOTES groupware system available from Lotus Development Corporation of Cambridge, Mass. In some embodiments, the desired information which is captured for each document also includes the method by which the document was accessed by the first user, e.g., by direct entry of the identifier such as the URL, selection of a stored bookmark, activation of a hyperlink from a prior document or from an external application such as electronic mail, or by navigation to the document through browser commands. The particular method used to access the document may then be considered in the decision process involved in parsing the entire document browse sequence into subsequences or clusters, on the theory that the method used to access the document reflects the first user's intention to continue along the same content trail, such as by activating a hyperlink, or to begin a new trail such as by selecting a stored bookmark.

In some embodiments, the method is implemented in one or more software programs stored on one or more computer readable media, such as magnetic or optical data storage devices, and executing on one or more computers to thereby cause the computers to perform the method. When used in a network environment, the method may be performed by two or more distinct computer programs residing on separate computers. For example, the step of capturing information regarding the first user's activities may be performed by a program residing on a client computer used by the first user, while the step of allowing second users to select a content area or label may be performed by a second program executing on a server centrally accessible by the first and second users.

In one embodiment, the method involves storing the information-browsing paths and patterns of content experts, and then using paths and patterns to help others who are browsing or searching in similar areas. More precisely, a browse path is the trail of documents opened or any other informational resources used to locate, view, navigate, or create documents, including queries and results. The system provides three ways to leverage expert browse paths:

1. If the second user knows a particular person with relevant expertise, the second user can explicitly specify that the system should provide access to that person's browse paths that are relevant to the query.

2. The second user can ask for the system to identify a ranked list of those with knowledge that appears relevant to current information needs and proceed as in (1).

3. The second user can ask the system to return browse paths relevant to a specific query, sorted [illegible] a particular expert's paths or can browse all of the paths.

Some of the above and other objects are achieved by a method for conveying expertise of a plurality of first users of a computer system to one or more second users of the computer system, the computer system storing a plurality of documents usable by the first and second users. The method involves capturing information regarding a sequence of documents used by each of the first users on the computer system, storing the captured information regarding each sequence of documents in association with the first user, allowing a second user to select one of the first users, and providing the sequence of documents associated with a selected first user to the second user.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention is illustrated in the figures of the accompanying drawings which are meant to be exemplary and not limiting, in which like references refer to like or corresponding parts, and in which:

FIG. 1 is a diagram of a system for capturing and conveying expertise in document usage in accordance with one embodiment of the present invention;

FIG. 2 is a flow chart showing a process of capturing and conveying expertise in document usage using the system of FIG. 1 in accordance with one embodiment of the present invention;

FIG. 3 is a flow chart showing a process of capturing and [illegible] expert's use of documents on the world wide web in accordance with one embodiment of the present invention;

FIG. 4 is a flow chart showing a process of allowing users to [illegible] document sequences by specifying a topic and/or expert in accordance with one embodiment of the present invention;

FIG. 5 is a flow chart showing a process of parsing a document usage trail in accordance with one embodiment of the present invention;

FIG. 6 is a flow chart showing the parsing process of FIG. 5 applying a first set of heuristics in accordance with one embodiment of the present invention; and

FIGS. 7A-7B contain a flow chart showing the parsing process of FIG. 5 applying a second set of heuristics in accordance with one embodiment of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The preferred embodiments of a system, method, and article of manufacture containing software programs in accordance with the present invention is described with reference to the drawings in FIGS. 1-7.

Referring to FIG. 1, one embodiment of the system 10 of the present invention includes a plurality of computer work-
stations 12 connected to a network 14, such as the Internet or other internet or intranet, and a monitoring server 16 connectable to the workstations 12 directly or over the Internet 14. As shown in one exemplary case in FIG. 1, the workstations 12 have a client application program 18 executing thereon which is capable of accessing, retrieving and using documents available from servers on the Internet 14, from the monitoring server 16, or from other workstations 12. In the case of the world wide web, the client application 18 is a web browser program such as NETSCAPE NAVIGATOR or INTERNET EXPLORER which communicates with the servers on the web via HTTP. Alternatively, the client application 18 may be a LOTUS NOTES client application which communicates and exchanges data with LOTUS NOTES databases operating on the servers. As one skilled in the art will recognize, many conventional client applications for network environments may be used without departing from the scope of the present invention.

In particular embodiments, the workstations 12 also have a logger program 20 operating thereon which logs the documents accessed and used by the client application 18 and stores this logged or captured information in a log file. The logger program 20 may be built into the client application 18, or may be a plug-in program or operating system-dependent program that spies on low level system events, such as a DDE, as known to those of skill in the art. Alternatively, the logger program 20 is a programmable intermediary which is programmed to monitor the use of the client application. An example of such an intermediary is web browser intelligence or WBI client as described in Barrett, R., Maglio, P. P., & Kellem, D. C. How to Personalize the Web, Proceedings of Human Factors in Computing Systems, CHI '97. (1997), New York: ACM Press, and Barrett, R. & Maglio, P. P., Intermediaries: New places for producing and manipulating web content, Proceedings of Seventh International World Wide Web Conference. Brisbane, Australia, 1998, both of which are hereby incorporated by reference into this application, and are available for downloading on the web at http://www.alphaworks.ibm.com. These programmable intermediaries provide a convenient means for monitoring the sequence of URLs viewed by a user, as the system is platform-independent, thus allowing identical code to be used on any kind of computer.

The logger program 20 monitors the usage of documents retrieved by the client application 18 and stores document identifiers which identify the documents used. In the case of the web, the document identifiers are the URLs for the document or pseudonyms thereof. In the case of documents retrieved from a LOTUS NOTES database, the document identifiers are LOTUS NOTES Universal Identifiers (UIDs) which name any document in any database by encoding an identifier for the NOTES database, including possible replica information, and an identifier for the document within the database. The document identifiers for consecutive documents accessed in a sequence during a browsing session are added to the log. A usage trail of all the documents accessed or used by the user during a browsing session is thus created and stored in the log file. In some embodiments, described in greater detail below, additional information is captured by the logger, including the method used to access the particular document and actions taken by the user in the document, e.g., search terms input by the user into a search engine.

In the case with NOTES UIDs, the additional information comprises client action information, which identifies the method used to access particular documents and actions taken in LOTUS NOTES. Notes supports hyperlinks in the form of document links and database links which take the user to a particular database without picking a document. Users can change windows in Notes, open databases manually, which is the equivalent of typing in a URL, pick an icon, or bookmark, on the desktop, and perform other operations or actions. These actions are captured as additional information.

At the end of a document browsing session, such as at the end of a day or when the user terminates the client application 18, the usage trail is ended. Additional usage trails may be established in the log file if the user initiates another browsing session. Because trails are meant ultimately to be shared, they are stored in a central location to which all users or clients have access. Thus, at certain points, the log file is transmitted from the workstation 12 to the monitoring server. The monitoring server 16 contains conventional computer hardware elements including a processor 22 and memory devices 24 including a RAM, ROM, hard disk, and other magnetic or optical disk drives.

The monitoring server 16 further contains a number of program modules for analyzing the user's log. These program modules include a pre-processor 26, a parser 28, and a labeler 30. The pre-processor 26 prepares a usage trail for parsing, and the parser 28 breaks the trail into content areas in accordance with processes described herein. The labeler 30 assigns labels or topics to the content areas, which topics are then arranged in a table of contents 32 which associates the various topics with the users from whose usage trails they were derived. The list of documents associated with each topic is stored in the table of contents 32 or a separate relational table 34, so that they can be retrieved for presentation to other users as described herein.

In particular embodiments, this central repository is implemented as a WBI server connected to a simple database or file system. Thus, the client-side WBI monitors URLs viewed by a specific individual and sends that information via a simple HTTP request to a central WBI server which maintains a database of all users and their trails. As in the case of monitoring, WBI provides a convenient and platform-independent means for maintaining these data, but many other schemes are possible as will be recognized by those of skill in the art, such as a DB2 or LOTUS NOTES database, though these would require different client-server protocols.

Referring to FIG. 2, one process for sharing expertise using the system of FIG. 1 begins when an expert user is browsing through documents, step 50. The documents accessed by the expert are monitored and stored, step 52, to create a usage trail. The usage trail is analyzed to automatically determine one or more content areas of the documents, step 54, and the usage trails and content areas are stored, step 56. The process of determining the content areas effectively cuts the usage trail into content areas and associates the trails with people with expertise in that content area. Experts are persons with peer acknowledged familiarity with a particular content area or as defined by one of the many computer systems designed to track expertise in organizations. A variety of methods are possible for cutting or breaking the usage trail, including ones that take account of content or semantics of the documents on the trails, topology or connectivity of documents on the trails, or both, as described in the above referenced 1998 article by Maglio & Barrett. An alternative method for meaningfully breaking trails into content areas based on local connections of trail elements is described in Maglio, P. P. & Barrett, R., How to Build Modeling Agents to Support Web Searchers, Proceedings of
the Sixth International Conference on User Modeling, New York, 1997, which is hereby incorporated by reference into this application. Alternative processes for determining the content areas are described below with reference to FIGS. 5-7.

Once the usage trail is broken and content areas identified, the portions of the usage areas are labeled with topics, step 58. In a simple embodiment, labeling is performed by selecting the most frequent word or phrase to appear in the portion of the usage trail. Other embodiments of labeling are used with reference to clustering technology, as described below and otherwise known to those of skill in the art.

A subsequent user may access the labels in alternative ways. The user may simply request a list of all available labels or topics displayed, step 60, in response to which the list is retrieved from and displayed and the user allowed to select one of the labels, step 62. The user may place a query request for a subject matter of interest to the user, step 64, and the query is executed on the available labels in an attempt to find a reasonably close match, step 66. Labels which are possible matches to the query are displayed and the user may make a selection therefrom, step 62. Once the user selects a label, the portion or subsequence of the usage trail associated with the label is retrieved and the documents in the subsequence provided to the user, step 68, possibly in the same order in which the documents were accessed by the expert in the expert's original browsing session as recorded in the usage trail.

Referring to FIGS. 3-4, the process of capturing expertise information and making it available to others is now described in greater detail with respect to the world wide web embodiment described above involving the use of WBI intermediaries on the client workstations and monitoring server. As shown in FIG. 3, as the expert browses the web, step 80, the WBI client checks when a new web page has been accessed, step 82. In this context, a new web page includes any change in web page, even to a web page the expert has previously accessed. When a new page is accessed, the client WBI agent adds the URL of the web page and the method used by the expert to access the web page into the usage trail stored in the log file, step 84. Alternatively, only new web pages not previously accessed can be added to the usage trail. The method of access used by the expert is stored in the log file for use in parsing the usage trail. The various methods of accessing web documents are well known, and include those set forth above such as input of a URL, selection of a bookmark, activation of a hyperlink from another document or another application, and browser navigation.

The client side WBI agent also sends the document to the monitoring server, step 86, for analysis of its content during parsing. Alternatively, the client side WBI agent can send the URL of each document to the monitoring server, which server then retrieves the document directly from its original server.

When the expert is done browsing, step 88, as determined by, e.g., termination of the browser program, or otherwise at scheduled times or events, the log file generated by the client side WBI agent and representing the captured information of expert activity is transmitted to the server side WBI, step 90. In alternative embodiments, the captured information is sent to the server as each new web page is accessed. The server side WBI agent parses the usage trail into subtrails or subsequences based on the captured information, step 92, which may include a combination of document content and user actions in accessing the document. The server WBI agent also generates content areas for each of the subsequences, step 94, based upon the contents of the documents. Labels are generated for each content area, step [illegible].

If the expert does not belong to a group, step 98, the labels generated for the user's content areas are stored in a database on the server. If the expert belongs to a group such as a [illegible] organization, step 98, the group data is [illegible] expert, step 102, and the expert, group [illegible] step [illegible]. The server [illegible] determination, or the expert's log file may contain data identifying which groups, if any, to which the expert belongs.

Referring to FIG. 4, a subsequent user has several options for finding document content areas for topics of interest. If the user knows the identity of an expert, the user can specify the expert, step 110, and the table of topics is queried for [illegible] as explained above with reference to FIG. 2, the user can display the list of all topics for that expert and make a selection therefrom, step 114, or may input a query for topics and be presented a list of matching topics. If the user knows [illegible] a particular expert, the user can specify the group, step [illegible], and the table of contents is queried for labels associated with the experts in that group, step 118. If the user does not know of any particular expert or group in the subject matter of interest to the user, the user can display a list of topics or perform a query, step 120, as described above.

Ultimately, the user chooses a label or topic which represents a trail of documents in the associated content area, step 122. In response, the server side WBI accesses the topic database to locate the trail of documents associated with the selected label, step 124. In addition, the server side WBI queries the database to determine whether other experts accessed documents in the located trail and other trails taken by such experts from the document, step 126. The server transmits the list of other experts, associated documents, and alternative trails to the client for sequential display. Alternatively, the server provides the list of URLs and other information to the client side WBI agent, which retrieves each of the documents in the trail in sequence, step 128. As a further alternative, the monitoring server retrieves the documents from the original server and transmits them to the client in sequence. The user is allowed to interact with each document provided, step 130, and to issue commands whether to proceed with the other documents in the trail or pursue another trail followed by one of the other identified experts.

The following exemplary series of scenarios assist in the understanding of the operation of the user selection options described herein. The exemplary situation is a financial consulting scenario.

Sara is a tax expert at a major financial consulting firm. She works in one of fifteen groups each consisting of ten tax consultants. Each group has a geographic specialty and each individual within each group has expertise in a particular area of tax law. Sara has a client who is a resident of Sweden, but who is a U.S. citizen. In addition, this client's family (husband and children) continues to live in San Francisco where they own residential and income property. Thus, she has to file personal income tax forms in both countries. However, Sara has joined the San Francisco group recently and she has not yet gained expertise in Swedish tax law. She needs to leverage the expertise of her counterparts in the group who are specialists in Swedish and international tax law.
Scenario 1: When the User Knows the Identity of the Expert. Sara wants to know how to account for US rental property income on the Swedish income tax forms and remembers meeting Sven Jorgensen at a company meeting. She would like to capitalize on his expertise in this area. Sara performs the following steps:

Sara selects Sven as the expert whose browsing paths she would like to query.

Sara begins to browse through his paths by typing in tentative query terms, such as "U.S. rental income". As she types in the query, relevant browse paths created by Sven appear on the screen.

Sara selects one of the paths and reviews the browse histories and documents until she finds the document that helps solve her problem.

Scenario 2: When the User Knows the Group. Sara wants to know how to account for U.S. rental property income on the Swedish income tax forms and vaguely remembers meeting a group of experts in US-Swedish tax law at a company meeting. Unfortunately, she does not remember any of their names. Sara needs to discover the experts in the area. She types in her query, which is performed on an indexed set of documents contained in the experts' browse paths. What is returned is a set of experts and a sub-set of each expert's browse paths that match the query. She requests the list of content area experts.

From this list, Sara thinks she remembers Sven Jorgensen and Ben Hogan as the experts she met, and selects their paths as ones she would like to browse.

Sara begins to browse through these paths by typing in tentative queries such as "U.S. rental income". As she types in the queries, paths created by the experts that are related to her query appear on the screen.

Sara selects one of the paths and reviews documents until she finds the document that helps solve her problem.

Scenario 3: When the User Needs to Identify an Expert. Sara wants to know how to account for U.S. rental property income on Swedish income tax forms. Unfortunately, she has no idea who in her company might know relevant information in this area.

Once again Sara needs to discover the experts in the area. She types in her query that is performed on an indexed set of documents contained in the experts' browse paths. What is returned is a set of experts and a sub-set of each expert's browse paths that match the query. She requests the list of content area experts.

Sara does not recognize any of the experts, so she selects all of the experts. She scans through the paths. At first, none seem especially related to her interests. However, after reviewing them carefully, one of the experts' paths looks interesting.

She selects that expert and requests all of that person's paths. Sara finds one of the paths and accompanying document that helps solve her problem.

In the first two of the scenarios described above, Sara either knows an expert or a group of experts who can likely answer her questions. In these cases, she is able to leverage her tacit knowledge about people and their differential expertise in her organization. She can review traces of the documents people have read with a reasonable expectation of finding more useful documents than if she had browsed or searched alone. In the third scenario, Sara has no idea who in the organization might have the expertise to help her. In this case, the system of the present invention relies on explicit representations of expertise in the form of updated profiles and taxonomies of people and their respective expertise.

As explained above, the parsing of the usage trail may be accomplished in a number of known ways, depending upon the results desired. Other methodologies which provide improved results over known methods are now described with reference to FIGS. 5-7B. These parsing methodologies are applicable to the expertise sharing methodology described above. They are also generally applicable to parsing of a user's browsing history for the user's own use, for example, to create an index or table of contents of the user's own browsing activities so the user can retrace his steps. The parsing methodologies described below may be implemented in stand-alone programs residing and executing on the user's own computer rather than on a remotely located server.

Referring to FIG. 5, the basic parsing method starts when [...] documents, step 150, and a usage [...] as user actions made in connection with the documents, step 152. These user actions include the method employed to access the document as explained above, and [...] interacting with the document. The actions logged are [...] help determine likely or definitive break points in the usage trail, step 154. Different heuristics may be applied to partition the usage trail based on the user actions, and [...]

Standard document clustering techniques are employed to determine content areas within the usage trail, step 156. The techniques may be any clustering algorithm including conventional ones such as the k-means clustering algorithm described in L. Bottou and Y. Bengio, Convergence Properties of the K-Means Algorithm, in Advances in Neural Information Processing Systems 7, pages 585-592 (MIT Press 1995), which is hereby incorporated by reference into this application. Several examples of additional document clustering algorithms are described in the following documents, which are hereby incorporated by reference into this application: Douglas R. Cutting, David R. Karger, Jan O. Pedersen, John W. Tukey, Scatter/Gather: A Cluster-based Approach to Browsing Large Document Collections. In Proceedings of the 15th Annual International ACM SIGIR Conference. Association for Computing Machinery. New York. June, 1992. Pages 318-329. Gerard Salton. Introduction to Modern Information Retrieval, (McGraw-Hill, New York 1983).

After clustering is completed, the content areas are labeled using standard labeling techniques, step 158. The labeling of document clusters is known to those of skill in the art, and is described for example in pages 314-323 of Peter G. Anick and Shivakumar Vaithyanathan, Exploiting Clustering and Phrases for Context-based Information Retrieval, in Proceedings of the 20th International ACM SIGIR Conference, Association for Computing Machinery, July 1997, which document is hereby incorporated by reference into this application. In the context of the world wide web, in hypertext documents, since there is more information than just the content of documents, documents can be clustered by content, or by analysis of their hyperlink structure, or both. Combining content and link information into a single clustering algorithm and purely content-based and purely link-based clustering is described in, for example "HyPursuit: A Hierarchical Network Search Engine that Exploits Content-Link Hypertext Clustering," Ron Weiss, Bienvenido Velez, Mark A. Sheldon, Chanathip Nemprempre, Peter Szilagyi, Andrzej Duda, and David K. Gifford, Proceedings of the Seventh ACM Conference on
Hypertext, Washington, D.C., March 1996 which document is hereby incorporated by reference into this application.

FIG. 6 shows one embodiment of the parsing using the first heuristic. Under this heuristic, certain user actions are deemed to end a prior document sequence or trail and begin a new one. These actions may vary, but in the particular embodiment in FIG. 6 they include inputting a URL, selecting a bookmark or choosing a URL from an external application. In theory, these actions likely represent the user's intention to start a new thread or search, since the user is jumping to a seemingly unrelated document. Entering a new search term into a search engine for a new web-wide search may similarly be considered the start of a new trail. Conversely, other actions such as selecting hyperlinks or navigating using browser commands are deemed to likely represent a continuation of the ongoing thread of interest or search for information.

Thus, referring to FIG. 6, given a usage trail log which includes the user actions obtained as described above, the parsing program employing this method loops through the usage trail to consider each document, step 170, and reads the method of user access to each document as stored in the log, step 172. For each document, the parsing program tests if the document was accessed by direct input of a URL, step 174, selection of a bookmark, step 176, or activation of a link in an external application, step 178, and, if any of those apply, a new document sequence is established, step 180. Otherwise, the current document is added to the current open subsequence, step 182. If there are more documents in the usage trail, step 184, the process continues, until all documents have been considered and the usage trail is divided into a set of subsequences. The program then performs clustering on each subsequence to generate clusters of related documents, step 186. The clusters are labeled, step 188, and the labels and associated documents are stored for later reference, step 190.

The methodology shown in FIG. 6 provides for a clean, relatively simple and quick way to divide up a usage trail for clustering. However, it may also fail to include documents together in a cluster which are otherwise quite closely related in substance because of the circumstances in which they are accessed, e.g., because the second one happened to have been accessed by one type of user action rather than another. This possibility is accounted for in the alternative methodology shown in FIGS. 7A-7B by giving some weight to user actions in clustering, rather than simply dividing the trail at certain user actions.

Beginning with FIG. 7A, given a usage trail log which includes the user actions obtained as described above, the parsing program employing this second method loops through the usage trail to consider each document, step 200, and reads the method of user access to each document as stored in the log, step 202. If the URL for the document is one of the user's bookmarks, the bookmark name is set as a token for clustering, step 206. Otherwise, a token is created representing the method of access, step 208. Creating the token comprises comparing the access method to a table of access methods which are set to suggest a topical break. If the table comparison suggests a topical break, then a new token is created which would not pass through the document parser, for example SuggestedClusterN, where N is the number of the current cluster. For every document encountered in the log until the next cluster break, the token, in this case SuggestedClusterN, is added to the cluster. The weight of the token is tuned to affect clustering as one skilled in the art would recognize.

In accordance with standard clustering techniques, the document itself is parsed into tokens, step 210, and weights are assigned to the tokens, step 212. This process is performed for all documents in the trail, step 218.

As an alternative to the logic illustrated in FIG. 7A, a new document sequence can be started upon specified actions taken by the expert, as well as creating tokens as illustrated in FIG. 7A.

Referring to FIG. 7B, for each document in the trail, step 220, a vector is generated as a combination of all the tokens, including the user action tokens, step 222. The vector is normalized in accordance with standard linear algebra techniques, step 224. This process is performed on each document until vectors have been generated for all documents, step 226. A vector space model is created from the vectors, step 228, and the documents are clustered based on the vector space model, step 230. Tokens from each cluster are selected to serve as the labels, step 232, in accordance with standard labeling techniques, and a table of contents is generated from the labels, step 234.

As a result of this second method, documents which are very close in content will still be clustered together even when an abrupt user action occurred between them. Conversely, documents which are actually unrelated will not be clustered together just because one was accessed through activation of a hyperlink in the other.

While the invention has been described and illustrated in connection with preferred embodiments, many variations and modifications as will be evident to those skilled in this art may be made without departing from the spirit and scope of the invention, and the invention is thus not to be limited to the precise details of methodology or construction set forth above as such variations and modification are intended to be included within the scope of the invention.

What is claimed is:

1. A method for conveying expertise of a first user of a computer system to one or more second users of the computer system, the computer system storing a plurality of documents usable by the first and second users, the method comprising:

capturing information regarding a sequence of documents used by the first user on the computer system;

associating a plurality of content areas with a plurality of sets of documents in the sequence of documents used by the first user based on the captured information;

allowing a second user of the computer system to select one of the content areas; and

providing the set of documents associated with a selected content area to the second user.

2. The method of claim 1, wherein the documents are each associated with an identifier, and wherein the step of capturing information comprises capturing the identifiers associated with the documents used.

3. The method of claim 1, wherein the documents contain content and the step of capturing information comprises capturing the content contained in the documents.

4. The method of claim 1, wherein the documents are accessible on the computer system by the first user through a plurality of different methods, and wherein the step of capturing information comprising capturing the method used by the first user to access each of the documents in the sequence.

5. The method of claim 1, wherein the step of associating the plurality of content areas comprises dividing the sequence of documents used by the first user into a plurality of subsequences of documents based on the captured information and associating a content area with each of the subsequences of documents.

6. The method of claim 5, wherein the step of associating the plurality of content areas comprises identifying a content
area to associate with each subsequence based on the
captured information.

7. The method of claim 1, wherein the step of allowing the 
second user to select one of the content areas comprises 
presenting the plurality of content areas to the user as a
plurality of selectable choices. 

8. The method of claim 1, wherein the step of allowing the 
second user to select one of the content areas comprises 
accepting a query from the second user about a desired 
subject and matching the queried subject with one of the
plurality of content areas.
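
By way of illustration only, the query matching recited in claim 8 might be sketched in Python as follows. The function name `match_content_area`, the word-overlap scoring, and the sample labels and URLs are assumptions made for this sketch; the specification does not prescribe a particular matching technique.

```python
from typing import Dict, List, Optional


def match_content_area(query: str, areas: Dict[str, List[str]]) -> Optional[str]:
    """Return the content-area label sharing the most words with the query.

    `areas` maps a content-area label to the documents associated with it.
    Returns None when no label shares any word with the query.
    """
    query_words = set(query.lower().split())
    best_label, best_score = None, 0
    for label in areas:
        score = len(query_words & set(label.lower().split()))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical content areas derived from an expert's usage trail.
areas = {
    "swedish income tax": ["http://example.com/se-tax"],
    "u.s. rental income": ["http://example.com/us-rental"],
}
selected = match_content_area("U.S. rental income", areas)
documents = areas[selected] if selected else []
```

A production system would more plausibly match the query against an index of document content than against labels alone; label overlap is used here only to keep the sketch short.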

9. The method of claim 1, wherein the step of providing 
the set of documents associated with the selected content 
area to the second user comprises providing the documents
in the set in the sequence in which the documents in the set
were used by the first user. 

10. The method of claim 1, wherein the computer system 
includes one or more servers storing the plurality of documents
and one or more clients used by the first and second
users, and wherein the step of capturing information regarding
the sequence of documents used by the first user comprises
monitoring the first user's use of the sequence of
documents on a first client, transmitting information about
the monitored use of the documents to a first server, and
storing the transmitted information on the first server.
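
As a rough illustration of the monitor, transmit, and store steps recited in claim 10, the following Python sketch encodes a client-side log and stores it per user on the server side. The record fields, the JSON-lines encoding, and the names `UsageRecord`, `serialize_log`, and `store_log` are assumptions made for this sketch; network transport is omitted.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class UsageRecord:
    user: str    # the first user (expert) being monitored
    url: str     # identifier of the document used
    method: str  # how the document was accessed


def serialize_log(records: List[UsageRecord]) -> str:
    """Client side: encode monitored usage as one JSON object per line."""
    return "\n".join(json.dumps(asdict(r)) for r in records)


def store_log(payload: str, store: Dict[str, List[dict]]) -> None:
    """Server side: decode the transmitted payload and store it per user."""
    for line in payload.splitlines():
        record = json.loads(line)
        store.setdefault(record["user"], []).append(record)


# Hypothetical usage: the payload would normally travel over the network.
client_log = [
    UsageRecord("sven", "http://example.com/tax", "bookmark"),
    UsageRecord("sven", "http://example.com/tax/forms", "hyperlink"),
]
server_store: Dict[str, List[dict]] = {}
store_log(serialize_log(client_log), server_store)
```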

11. The method of claim 10, wherein the first server 
performs the step of associating the content areas with the 
sets of documents and performs the step of providing the set 
of documents to the second user in response to the selection
of the content area by the second user using a second client.

12. The method of claim 10, wherein the computer system 
comprises a distributed computing environment in which 
documents are contained in one or more databases stored on 
the one or more servers, wherein each of the documents is
identified by database location data, and wherein the step of
transmitting information about the monitored use of the 
documents comprises transmitting the database location data 
identifying the documents. 

13. The method of claim 10, wherein the computer system 
comprises at least part of the world wide web in which
documents stored on one or more web servers are each
identified by a URL, and wherein the step of transmitting
information about the monitored use of the documents
comprises transmitting the URLs identifying the documents.

14. The method of claim 13, wherein the documents are
accessible on the computer system by the first user through
a plurality of different methods including input of the URL,
selection of a stored bookmark, and activation of a
hyperlink, and wherein the step of capturing information
comprises capturing the method used by the first user to
access each of the documents in the sequence.

15. The method of claim 14, wherein the step of associ- 
ating the plurality of content areas comprises dividing the 
sequence of documents used by the first user into a plurality
of subsequences of documents where each subsequence
comprises a first document in the subsequence accessed
through the first user's selection of a bookmark or entry of
a URL and one or more second documents following the first
document in the subsequence accessed through activation of
one or more hyperlinks.

16. The method of claim 15, wherein the plurality of
different methods through which documents are accessible
on the computer system by the first user through further
include navigation to a document using navigational functions
of a browser, and wherein the step of dividing the
sequence of documents used by the first user into subsequences
comprises including a document in a previously
started subsequence when the document is accessed through
the first user's navigation to the document using navigational
functions of the browser.
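
The division into subsequences recited in claims 15 and 16 can be pictured with a short Python sketch. The `Visit` record, the access-method strings, and the function name `split_trail` are hypothetical and chosen only for illustration: a bookmark selection or URL entry opens a new subsequence, while hyperlink activation and browser navigation extend the one already open.

```python
from dataclasses import dataclass
from typing import List

# Assumed labels for access methods that start a new subsequence.
BREAK_METHODS = {"url_input", "bookmark"}


@dataclass
class Visit:
    url: str
    method: str  # e.g. "url_input", "bookmark", "hyperlink", "back"


def split_trail(trail: List[Visit]) -> List[List[Visit]]:
    """Divide a usage trail into subsequences at break-type access methods."""
    subsequences: List[List[Visit]] = []
    for visit in trail:
        # Open a new subsequence on a break action, or if none exists yet.
        if visit.method in BREAK_METHODS or not subsequences:
            subsequences.append([visit])
        else:
            # Hyperlinks and browser navigation continue the open subsequence.
            subsequences[-1].append(visit)
    return subsequences


trail = [
    Visit("http://example.com/tax", "bookmark"),
    Visit("http://example.com/tax/forms", "hyperlink"),
    Visit("http://example.com/tax", "back"),
    Visit("http://example.org/news", "url_input"),
    Visit("http://example.org/news/item", "hyperlink"),
]
parts = split_trail(trail)
```

In this sample trail the bookmark and the typed URL each open a subsequence, so the five visits fall into two groups of three and two documents.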

17. The method of claim 1, wherein the documents 
contain content, wherein the step of capturing information 
comprises capturing the content in the sequence of 
documents, and wherein the step of associating content areas 
comprises clustering documents in the sequence into sets 
based on the content in the documents. 

18. The method of claim 17, wherein the documents are 
accessible on the computer system by the first user through 
a plurality of different methods, wherein the step of captur- 
ing information comprising capturing the method used by 
the first user to access each of the documents in the 
sequence, and wherein the step of clustering documents in 
the sequence comprises clustering documents into sets based 
on the content in the documents and the method used to 
access each document. 
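
One way to picture clustering on both content and access method, as recited in claim 18, is to add an access-method token to each document's token vector before normalizing it. The Python sketch below shows only this feature-building step and a cosine similarity that a standard clustering algorithm could consume. The `ACCESS_` token prefix, the default weight of 2.0, and the function names are assumptions made for this sketch.

```python
import math
from collections import Counter
from typing import Dict


def document_vector(content: str, access_method: str,
                    action_weight: float = 2.0) -> Dict[str, float]:
    """Build a unit-length token vector from content plus an access token."""
    counts = Counter(content.lower().split())
    vec = {tok: float(n) for tok, n in counts.items()}
    # Upper-case prefix keeps the action token distinct from content tokens,
    # which are lower-cased above.
    vec["ACCESS_" + access_method] = action_weight
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {tok: v / norm for tok, v in vec.items()}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * b.get(tok, 0.0) for tok, w in a.items())


d1 = document_vector("swedish income tax forms", "hyperlink")
d2 = document_vector("swedish income tax forms", "url_input")
same_method = cosine(d1, d1)
different_method = cosine(d1, d2)
```

Raising `action_weight` makes the access method count for more relative to content; lowering it lets documents with similar content stay close even when they were reached by different actions.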

19. The method of claim 1, comprising: 

capturing information regarding a sequence of documents 
used by each of plurality of first users on the computer 
system; 

associating each sequence of documents with one of the 
first users; and 

associating a plurality of content areas with a plurality of
sets of documents in each of the sequence of documents 
based on the captured information, thereby associating 
a plurality of content areas with each of the first users. 

20. The method of claim 19, comprising allowing the 
second user to select one of the plurality of first users and 
allowing the second user to select a content area from the 
plurality of content areas associated with the selected first
user. 

21. The method of claim 19, comprising presenting to the 
second user a list containing the one or more first users 
associated with a content area selected by the second user. 

22. The method of claim 21, comprising allowing the 
second user to select one of the first users from the list of first 
users associated with the selected content area. 
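
The list of first users presented under claims 21 and 22 could, for example, be produced from an inverted index that maps each content area to the users associated with it. The following Python sketch assumes content areas have already been associated with each first user; the function name `experts_by_area` and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def experts_by_area(user_areas: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Invert a user -> content areas mapping into content area -> users."""
    index: Dict[str, List[str]] = defaultdict(list)
    for user, areas in user_areas.items():
        for area in areas:
            index[area].append(user)
    return dict(index)


# Hypothetical association of content areas with first users.
user_areas = {
    "Sven Jorgensen": ["swedish tax law", "u.s. rental income"],
    "Ben Hogan": ["swedish tax law"],
}
index = experts_by_area(user_areas)
# The second user selects a content area and is shown the matching experts.
experts = index.get("swedish tax law", [])
```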

23. The method of claim 19, wherein the plurality of first 
users are organized in a plurality of groups, comprising
allowing the second user to select one of the plurality of first 
user groups and allowing the second user to select a content 
area associated with one or more of the first users in the 
selected first user group. 

24. A method for conveying expertise of a plurality of first
users of a computer system to one or more second users of 
the computer system, the computer system storing a plural- 
ity of documents usable by the first and second users, the 
method comprising: 

capturing information regarding a sequence of documents 
used by each of the first users on the computer system; 

storing the captured information regarding each sequence 
of documents in association with the first user; 

allowing a second user to select one of the first users; and 

providing the sequence of documents associated with a 
selected first user to the second user. 

25. A system for conveying expertise of a first user to one 
or more second users, wherein a plurality of documents are 
accessible for use by the first and second users, the method 
comprising: 

a logger for capturing information regarding a sequence 
of documents used by the first user on the computer 
system; 

a parser for associating a plurality of content areas with a 
plurality of sets of documents in the sequence of
documents used by the first user based on the captured 
information; 
means for accepting a selection by a second user of one
of the content areas; and

means for providing the set of documents associated with
a selected content area to the second user.

26. A computer readable medium storing one or more
software programs which, when executed, cause a computer
to perform a method for conveying expertise of a first user
of a computer system to one or more second users of the
computer system, the computer system storing a plurality of
documents usable by the first and second users, the method
comprising:

capturing information regarding a sequence of documents
used by the first user on the computer system; 

associating a plurality of content areas with a plurality of 
sets of documents in the sequence of documents used 
by the first user based on the captured information; 

allowing a second user of the computer system to select 
one of the content areas; and 

providing the set of documents associated with a selected
content area to the second user. 

27. The computer readable medium of claim 26, wherein
the documents are each associated with an identifier, and 
wherein the step caused to be performed by the one or more 
software programs of capturing information comprises capturing
the identifiers associated with the documents used.

28. The computer readable medium of claim 26, wherein
the documents contain content and the step caused to be
performed by the one or more software programs of capturing
information comprises capturing the content contained
in the documents.

29. The computer readable medium of claim 26, wherein 
the documents are accessible on the computer system by the 
first user through a plurality of different methods, and 
wherein the step caused to be performed by the one or more
software programs of capturing information comprising 
capturing the method used by the first user to access each of 
the documents in the sequence. 

30. The computer readable medium of claim 26, wherein 
the step caused to be performed by the one or more software
programs of associating the plurality of content areas com- 
prises dividing the sequence of documents used by the first 
user into a plurality of subsequences of documents based on 
the captured information and associating a content area with 
each of the subsequences of documents.

31. The computer readable medium of claim 30, wherein 
the step caused to be performed by the one or more software 
programs of associating the plurality of content areas com- 
prises identifying a content area to associate with each 
subsequence based on the captured information.

32. The computer readable medium of claim 26, wherein
the step caused to be performed by the one or more software 
programs of allowing the second user to select one of the 
content areas comprises presenting the plurality of content 
areas to the user as a plurality of selectable choices.

33. The computer readable medium of claim 26, wherein 
the step caused to be performed by the one or more software 
programs of allowing the second user to select one of the 
content areas comprises accepting a query from the second 
user about a desired subject and matching the queried
subject with one of the plurality of content areas. 

34. The computer readable medium of claim 26, wherein 
the step caused to be performed by the one or more software 
programs of providing the set of documents associated with 
the selected content area to the second user comprises
providing the documents in the set in the sequence in which 
the documents in the set were used by the first user. 
35. The computer readable medium of claim 26, wherein
the computer system includes one or more servers storing 
the plurality of documents and one or more clients used by 
the first and second users, and wherein the step caused to be 
performed by the one or more software programs of cap- 
turing information regarding the sequence of documents 
used by the first user comprises monitoring the first user's 
use of the sequence of documents on a first client, transmit- 
ting information about the monitored use of the documents 
to a first server, and storing the transmitted information on 
the first server. 

36. The computer readable medium of claim 35, wherein 
the first server executes the one or more software programs 
which caused to be performed the steps of associating the 
content areas with the sets of documents of providing the set 
of documents to the second user in response to the selection 
of the content area by the second user using a second client. 

37. The computer readable medium of claim 35, wherein
the computer system comprises a distributed computing 
environment in which documents are contained in one or 
more databases stored on the one or more servers, wherein 
each of the documents is identified by database location 
data, and wherein the step caused to be performed by the one 
or more software programs of transmitting information 
about the monitored use of the documents comprises trans- 
mitting the database location data identifying the docu- 
ments. 

38. The computer readable medium of claim 35, wherein 
the computer system comprises at least part of the world 
wide web in which documents stored on one or more web 
servers are each identified by a URL, and wherein the step 
caused to be performed by the one or more software pro- 
grams of transmitting information about the monitored use 
of the documents comprises transmitting the URLs identi- 
fying the documents. 

39. The computer readable medium of claim 38, wherein
the documents are accessible on the computer system by the 
first user through a plurality of different methods including 
input of the URL, selection of a stored bookmark, and 
activation of a hyperlink, and wherein the step caused to be 
performed by the one or more software programs of cap- 
turing information comprises capturing the method used by 
the first user to access each of the documents in the 
sequence. 

40. The computer readable medium of claim 39, wherein 
the step caused to be performed by the one or more software 
programs of associating the plurality of content areas com- 
prises dividing the sequence of documents used by the first 
user into a plurality of subsequences of documents where 
each subsequence comprises a first document in the subse- 
quence accessed through the first user's selection of a 
bookmark or entry of a URL and one or more second 
documents following the first document in the subsequence 
accessed through activation of one or more hyperlinks. 

41. The computer readable medium of claim 40, wherein 
the plurality of different methods through which documents 
are accessible on the computer system by the first user 
through further include navigation to a document using 
navigational functions of a browser, and wherein the step 
caused to be performed by the one or more software pro- 
grams of dividing the sequence of documents used by the 
first user into subsequences comprises including a document 
in a previously started subsequence when the document is 
accessed through the first user's navigation to the document 
using navigational functions of the browser. 

42. The computer readable medium of claim 26, wherein 
the documents contain content, wherein the step caused to be 
performed by the one or more software programs of capturing
information comprises capturing the content in the
sequence of documents, and wherein the step caused to be 
performed by the one or more software programs of asso- 
ciating content areas comprises clustering documents in the 
sequence into sets based on the content in the documents. 

43. The computer readable medium of claim 42, wherein
the documents are accessible on the computer system by the
first user through a plurality of different methods, wherein
the step caused to be performed by the one or more software 
programs of capturing information comprising capturing the 
method used by the first user to access each of the docu- 
ments in the sequence, and wherein the step caused to be 
performed by the one or more software programs of clus- 
tering documents in the sequence comprises clustering docu- 
ments into sets based on the content in the documents and 
the method used to access each document. 

44. The computer readable medium of claim 26, wherein
the one or more software programs cause the computer to 
execute additional steps comprising: 

capturing information regarding a sequence of documents
used by each of plurality of first users on the computer 
system; 

associating each sequence of documents with one of the 
first users; and 

associating a plurality of content areas with a plurality of 
sets of documents in each of the sequence of documents 
based on the captured information, thereby associating 
a plurality of content areas with each of the first users. 

45. The computer readable medium of claim 44, wherein 
the one or more software programs cause the computer to 
allow the second user to select one of the plurality of first 
users and allow the second user to select a content area from 
the plurality of content areas associated with the selected 
first user. 

46. The computer readable medium of claim 44, wherein
the one or more software programs cause the computer to 
present to the second user a list containing the one or more 
first users associated with a content area selected by the 
second user. 
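
As a hedged illustration of the listing step in claim 46, the associations of claim 44 could be inverted so that a selected content area yields its first users. The function name, the dictionary layout, and the example names in the usage note are assumptions made for this sketch only.

```python
def users_by_content_area(areas_by_user):
    """Invert a user -> content areas mapping into content area -> users.

    `areas_by_user` is a hypothetical structure: a dict whose keys are
    first-user names and whose values are lists of content-area labels.
    """
    index = {}
    for user, areas in areas_by_user.items():
        for area in areas:
            index.setdefault(area, set()).add(user)
    # Sort each user list so the presented list has a stable order.
    return {area: sorted(users) for area, users in index.items()}
```

A second user who selects the content area "java" would then be shown `users_by_content_area(...)["java"]`, the list of first users associated with that area.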

47. The computer readable medium of claim 46, wherein 
the one or more software programs cause the computer to 
allow the second user to select one of the first users from the
list of first users associated with the selected content area. 

48. The computer readable medium of claim 46, wherein 
the plurality of first users are organized in a plurality of 
groups, wherein the one or more software programs cause 
the computer to allow the second user to select one of the 
plurality of first user groups and allow the second user to 
select a content area associated with one or more of the first 
users in the selected first user group. 

49. A computer readable medium storing one or more 
software programs which, when executed, cause a computer 
to perform a method for conveying expertise of a plurality 
of first users of a computer system to one or more second 
users of the computer system, the computer system storing 
a plurality of documents usable by the first and second users, 
the method comprising: 

capturing information regarding a sequence of documents 
used by each of the first users on the computer system; 

storing the captured information regarding each sequence 
of documents in association with the first user;

allowing a second user to select one of the first users; and 

providing the sequence of documents associated with a 
selected first user to the second user. 
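
The four steps of claim 49 can be sketched, purely for illustration, as a small in-memory store. The class and method names are hypothetical, and an actual system would capture usage from a browser or proxy rather than through direct calls as assumed here.

```python
from collections import defaultdict


class UsageTrailStore:
    """Hypothetical in-memory store of per-user document sequences."""

    def __init__(self):
        self._trails = defaultdict(list)

    def capture(self, first_user, document):
        """Capture and store one document use in association with a first user."""
        self._trails[first_user].append(document)

    def first_users(self):
        """List the first users a second user may select from."""
        return sorted(self._trails)

    def trail_for(self, first_user):
        """Provide the sequence of documents for the selected first user."""
        return list(self._trails.get(first_user, []))
```

For example, after `store.capture("alice", "a.html")` and `store.capture("alice", "b.html")`, a second user who selects "alice" from `store.first_users()` receives `["a.html", "b.html"]` from `store.trail_for("alice")`.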